Goto

Collaborating Authors

 Oneida County


Appendix A Details of Modeling

Neural Information Processing Systems

We retrieve top 10 passages and use them as input to mGEN. Gettysburg College, where he was a member of the Lambda Chi Alpha fraternity. We further subsample 50% of the synthetically generated questions. For our multilingual retriever, we split each article into 100-token chunks (Karpukhin et al., 2020), The original passage text file is 29GB, and the total index size is around 129 GB. Both two datasets are under the MIT licence.


Tendon-Actuated Concentric Tube Endonasal Robot (TACTER)

Yamamoto, Kent K., Zachem, Tanner J., Kheradmand, Pejman, Zheng, Patrick, Abdelgadir, Jihad, Bailey, Jared Laurance, Pieter, Kaelyn, Codd, Patrick J., Chitalia, Yash

arXiv.org Artificial Intelligence

Endoscopic endonasal approaches (EEA) have become more prevalent for minimally invasive skull base and sinus surgeries. However, rigid scopes and tools significantly decrease the surgeon's ability to operate in tight anatomical spaces and avoid critical structures such as the internal carotid artery and cranial nerves. This paper proposes a novel tendon-actuated concentric tube endonasal robot (TACTER) design in which two tendon-actuated robots are concentric to each other, resulting in an outer and inner robot that can bend independently. The outer robot is a unidirectionally asymmetric notch (UAN) nickel-titanium robot, and the inner robot is a 3D-printed bidirectional robot, with a nickel-titanium bending member. In addition, the inner robot can translate axially within the outer robot, allowing the tool to traverse through structures while bending, thereby executing follow-the-leader motion. A Cosserat-rod based mechanical model is proposed that uses tendon tension of both tendon-actuated robots and the relative translation between the robots as inputs and predicts the TACTER tip position for varying input parameters. The model is validated with experiments, and a human cadaver experiment is presented to demonstrate maneuverability from the nostril to the sphenoid sinus. This work presents the first tendon-actuated concentric tube (TACT) dexterous robotic tool capable of performing follow-the-leader motion within natural nasal orifices to cover workspaces typically required for a successful EEA.



PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Anderson, Carolyn Jane, Biswas, Joydeep, Boruch-Gruszecki, Aleksander, Cassano, Federico, Feldman, Molly Q, Guha, Arjun, Lucchetti, Francesca, Wu, Zixuan

arXiv.org Artificial Intelligence

Existing benchmarks for frontier models often test specialized, ``PhD-level'' knowledge that is difficult for non-experts to grasp. In contrast, we present a benchmark based on the NPR Sunday Puzzle Challenge that requires only general knowledge. Our benchmark is challenging for both humans and models, however correct solutions are easy to verify, and models' mistakes are easy to spot. Our work reveals capability gaps that are not evident in existing benchmarks: OpenAI o1 significantly outperforms other reasoning models that are on par on benchmarks that test specialized knowledge. Furthermore, our analysis of reasoning outputs uncovers new kinds of failures. DeepSeek R1, for instance, often concedes with ``I give up'' before providing an answer that it knows is wrong. R1 can also be remarkably ``uncertain'' in its output and in rare cases, it does not ``finish thinking,'' which suggests the need for an inference-time technique to ``wrap up'' before the context window limit is reached. We also quantify the effectiveness of reasoning longer with R1 and Gemini Thinking to identify the point beyond which more reasoning is unlikely to improve accuracy on our benchmark.


EmoXpt: Analyzing Emotional Variances in Human Comments and LLM-Generated Responses

Pyreddy, Shireesh Reddy, Zaman, Tarannum Shaila

arXiv.org Artificial Intelligence

The widespread adoption of generative AI has generated diverse opinions, with individuals expressing both support and criticism of its applications. This study investigates the emotional dynamics surrounding generative AI by analyzing human tweets referencing terms such as ChatGPT, OpenAI, Copilot, and LLMs. To further understand the emotional intelligence of ChatGPT, we examine its responses to selected tweets, highlighting differences in sentiment between human comments and LLM-generated responses. We introduce EmoXpt, a sentiment analysis framework designed to assess both human perspectives on generative AI and the sentiment embedded in ChatGPT's responses. Unlike prior studies that focus exclusively on human sentiment, EmoXpt uniquely evaluates the emotional expression of ChatGPT. Experimental results demonstrate that LLM-generated responses are notably more efficient, cohesive, and consistently positive than human responses.


WavePulse: Real-time Content Analytics of Radio Livestreams

Mittal, Govind, Gupta, Sarthak, Wagle, Shruti, Chopra, Chirag, DeMattee, Anthony J, Memon, Nasir, Ahamad, Mustaque, Hegde, Chinmay

arXiv.org Artificial Intelligence

Radio remains a pervasive medium for mass information dissemination, with AM/FM stations reaching more Americans than either smartphone-based social networking or live television. Increasingly, radio broadcasts are also streamed online and accessed over the Internet. We present WavePulse, a framework that records, documents, and analyzes radio content in real-time. While our framework is generally applicable, we showcase the efficacy of WavePulse in a collaborative project with a team of political scientists focusing on the 2024 Presidential Elections. We use WavePulse to monitor livestreams of 396 news radio stations over a period of three months, processing close to 500,000 hours of audio streams. These streams were converted into time-stamped, diarized transcripts and analyzed to track answer key political science questions at both the national and state levels. Our analysis revealed how local issues interacted with national trends, providing insights into information flow. Our results demonstrate WavePulse's efficacy in capturing and analyzing content from radio livestreams sourced from the Web. Code and dataset can be accessed at \url{https://wave-pulse.io}.


Towards Inclusive Face Recognition Through Synthetic Ethnicity Alteration

Chandaliya, Praveen Kumar, Raja, Kiran, Ramachandra, Raghavendra, Akhtar, Zahid, Busch, Christoph

arXiv.org Artificial Intelligence

Numerous studies have shown that existing Face Recognition Systems (FRS), including commercial ones, often exhibit biases toward certain ethnicities due to under-represented data. In this work, we explore ethnicity alteration and skin tone modification using synthetic face image generation methods to increase the diversity of datasets. We conduct a detailed analysis by first constructing a balanced face image dataset representing three ethnicities: Asian, Black, and Indian. We then make use of existing Generative Adversarial Network-based (GAN) image-to-image translation and manifold learning models to alter the ethnicity from one to another. A systematic analysis is further conducted to assess the suitability of such datasets for FRS by studying the realistic skin-tone representation using Individual Typology Angle (ITA). Further, we also analyze the quality characteristics using existing Face image quality assessment (FIQA) approaches. We then provide a holistic FRS performance analysis using four different systems. Our findings pave the way for future research works in (i) developing both specific ethnicity and general (any to any) ethnicity alteration models, (ii) expanding such approaches to create databases with diverse skin tones, (iii) creating datasets representing various ethnicities which further can help in mitigating bias while addressing privacy concerns.


5G Network Slicing: Analysis of Multiple Machine Learning Classifiers

Malkoc, Mirsad, Kholidy, Hisham A.

arXiv.org Artificial Intelligence

The division of one physical 5G communications infrastructure into several virtual network slices with distinct characteristics such as bandwidth, latency, reliability, security, and service quality is known as 5G network slicing. Each slice is a separate logical network that meets the requirements of specific services or use cases, such as virtual reality, gaming, autonomous vehicles, or industrial automation. The network slice can be adjusted dynamically to meet the changing demands of the service, resulting in a more cost-effective and efficient approach to delivering diverse services and applications over a shared infrastructure. This paper assesses various machine learning techniques, including the logistic regression model, linear discriminant model, k-nearest neighbor's model, decision tree model, random forest model, SVC BernoulliNB model, and GaussianNB model, to investigate the accuracy and precision of each model on detecting network slices. The report also gives an overview of 5G network slicing.


On-ramp and Off-ramp Traffic Flows Estimation Based on A Data-driven Transfer Learning Framework

Ma, Xiaobo, Karimpour, Abolfazl, Wu, Yao-Jan

arXiv.org Artificial Intelligence

To develop the most appropriate control strategy and monitor, maintain, and evaluate the traffic performance of the freeway weaving areas, state and local Departments of Transportation need to have access to traffic flows at each pair of on-ramp and off-ramp. However, ramp flows are not always readily available to transportation agencies and little effort has been made to estimate these missing flows in locations where no physical sensors are installed. To bridge this research gap, a data-driven framework is proposed that can accurately estimate the missing ramp flows by solely using data collected from loop detectors on freeway mainlines. The proposed framework employs a transfer learning model. The transfer learning model relaxes the assumption that the underlying data distributions of the source and target domains must be the same. Therefore, the proposed framework can guarantee high-accuracy estimation of on-ramp and off-ramp flows on freeways with different traffic patterns, distributions, and characteristics. Based on the experimental results, the flow estimation mean absolute errors range between 23.90 veh/h to 40.85 veh/h for on-ramps, and 31.58 veh/h to 45.31 veh/h for off-ramps; the flow estimation root mean square errors range between 34.55 veh/h to 57.77 veh/h for on-ramps, and 41.75 veh/h to 58.80 veh/h for off-ramps. Further, the comparison analysis shows that the proposed framework outperforms other conventional machine learning models. The estimated ramp flows based on the proposed method can help transportation agencies to enhance the operations of their ramp control strategies for locations where physical sensors are not installed.


A Semantic Framework for Enabling Radio Spectrum Policy Management and Evaluation

Santos, H., Mulvehill, A., Erickson, J. S., McCusker, J. P., Gordon, M., Xie, O., Stouffer, S., Capraro, G., Pidwerbetsky, A., Burgess, J., Berlinsky, A., Turck, K., Ashdown, J., McGuinness, D. L.

arXiv.org Artificial Intelligence

Because radio spectrum is a finite resource, its usage and sharing is regulated by government agencies. These agencies define policies to manage spectrum allocation and assignment across multiple organizations, systems, and devices. With more portions of the radio spectrum being licensed for commercial use, the importance of providing an increased level of automation when evaluating such policies becomes crucial for the efficiency and efficacy of spectrum management. We introduce our Dynamic Spectrum Access Policy Framework for supporting the United States government's mission to enable both federal and non-federal entities to compatibly utilize available spectrum. The DSA Policy Framework acts as a machine-readable policy repository providing policy management features and spectrum access request evaluation. The framework utilizes a novel policy representation using OWL and PROV-O along with a domain-specific reasoning implementation that mixes GeoSPARQL, OWL reasoning, and knowledge graph traversal to evaluate incoming spectrum access requests and explain how applicable policies were used. The framework is currently being used to support live, over-the-air field exercises involving a diverse set of federal and commercial radios, as a component of a prototype spectrum management system.